Selecting a Small Subset of Informative Genes from Gene Expression Data by Using a Modified Binary Particle Swarm Optimisation
Authors
Abstract
Gene expression technology, especially microarrays, can be used to measure the expression levels of thousands of genes simultaneously in biological organisms. Gene expression data produced by microarrays are expected to be useful for cancer classification. To select a small subset of informative genes for cancer classification, many researchers have analysed gene expression data using various computational intelligence methods. However, because the number of samples is small compared with the huge number of genes (high-dimensional data), and because many genes are irrelevant or noisy, many of these methods have difficulty selecting a small subset. Thus, we propose a modified binary particle swarm optimisation (BPSO) to select a small subset of informative genes that are relevant for cancer classification. In the proposed method, we introduce the particle speed and a rule for increasing the probability that bits in a particle’s position are zero. The method was empirically applied to a suite of four well-known benchmark gene expression data sets. The experimental results demonstrate that the proposed method outperforms the conventional version of BPSO and other related works in terms of classification accuracy and the number of selected genes. In addition, the method has lower running times than BPSO.
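The abstract does not give the update equations, so the following is only a minimal sketch of one conventional binary PSO update step, in which each bit of a particle marks a gene as selected (1) or not (0). The `zero_bias` parameter is a hypothetical stand-in for a rule that raises the probability of zero bits; it is an assumption for illustration and not the authors' rule. The function name, the default coefficients, and `v_max` are likewise assumptions.

```python
import math
import random


def sigmoid(v):
    """Map a real-valued velocity to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def update_particle(position, velocity, pbest, gbest, rng,
                    w=1.0, c1=2.0, c2=2.0, v_max=4.0, zero_bias=0.0):
    """One conventional binary PSO step for a single particle.

    position, pbest, gbest: lists of 0/1 bits (1 = gene selected).
    velocity: list of floats, one per bit.
    zero_bias: hypothetical knob in [0, 1] that scales down the
        probability of a bit being 1 (NOT the paper's actual rule).
    """
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, pbest, gbest):
        # Standard velocity update: inertia + cognitive + social terms.
        v = w * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        # Clamp the velocity so the sigmoid does not saturate.
        v = max(-v_max, min(v_max, v))
        # Probability that the bit becomes 1, reduced by the assumed bias.
        prob_one = sigmoid(v) * (1.0 - zero_bias)
        new_position.append(1 if rng.random() < prob_one else 0)
        new_velocity.append(v)
    return new_position, new_velocity


if __name__ == "__main__":
    rng = random.Random(42)
    pos, vel = update_particle([1, 0, 1, 0], [0.0] * 4,
                               [1, 1, 0, 0], [0, 1, 1, 0], rng,
                               zero_bias=0.5)
    print(pos, vel)
```

With `zero_bias=0.0` the step reduces to the conventional binary PSO update; larger values push particles toward fewer selected genes, which is the general effect the abstract describes.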
Similar resources
A Constraint and Rule in an Enhancement of Binary Particle Swarm Optimization to Select Informative Genes for Cancer Classification
Gene expression data have been analyzed by many researchers using a range of computational intelligence methods. Selecting a small subset of informative genes from the gene expression data can support cancer classification. Nevertheless, many of the computational methods face difficulties in selecting a small subset, since the number of samples is small compared to the huge number of gene...
Identification of Alzheimer disease-relevant genes using a novel hybrid method
Identifying genes underlying complex diseases/traits that generally involve multiple etiological mechanisms and contributing genes is difficult. Although microarray technology has enabled researchers to investigate gene expression changes, identifying pathobiologically relevant genes remains a challenge. To address this challenge, we apply a new method for selecting the disease-relevant gen...
Improved Multiobjective Binary Biogeography Based Optimization using CVM for Feature Selection Using Gene Expression Data
Gene expression data play an important role in the development of efficient cancer diagnosis and classification. The genes identified are subsequently used to classify independent test set samples. Different feature selection methods are investigated, and the most frequent features are selected among all methods. This paper provides gene selection strategies for multiclass classification that ca...
Greedy Search-Binary PSO Hybrid for Biclustering Gene Expression Data
As a useful data mining technique, biclustering identifies local patterns from gene expression data. A bicluster of a gene expression dataset is a subset of genes which exhibit similar expression patterns along a subset of conditions. In this paper, a new method is introduced based on a greedy search algorithm combined with the evolutionary technique of particle swarm optimization for the identificati...
Gene selection using hybrid particle swarm optimization and genetic algorithm
Selecting highly discriminative genes from gene expression data has become an important research topic. Not only can this improve the performance of cancer classification, but it can also cut down the cost of medical diagnoses when a large number of noisy, redundant genes are filtered out. In this paper, a hybrid Particle Swarm Optimization (PSO) and Genetic Algorithm (GA) method is used for gene selection...